Semantic And Discourse Information For Text-To-Speech Intonation

Authors

Laurie Hiyakumoto

Scott Prevost

Justine Cassell

Abstract

Concept-to-Speech (CTS) systems, which aim to synthesize speech from semantic information and discourse context, have succeeded in producing more appropriate and natural-sounding prosody than text-to-speech (TTS) systems, which rely mostly on syntactic and orthographic information. In this paper, we show how recent advances in CTS systems can be used to improve intonation in text reading systems for English. Specifically, following (Prevost, 1995; Prevost, 1996), we show how our program uses information structure to produce intonational patterns with context-appropriate variation in pitch accent type and prominence. Following (Cahn, 1994; Cahn, 1997), we also show how some of the semantic information used by such CTS systems can be drawn from WordNet (Miller et al., 1993), a large-scale semantic lexicon.
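As a rough illustration of the idea in the abstract, the following Python sketch assigns a pitch accent only to words that are new to the discourse and leaves previously mentioned words unaccented. It is not the authors' program. The given/new rule is a deliberate simplification of information structure, `TOY_SYNSETS` is a hand-made stand-in for WordNet lookups, and the function names, the function-word list, and the use of the ToBI-style label `H*` are all assumptions made for this example.

```python
# Hypothetical synonym sets standing in for WordNet; NOT real WordNet data.
TOY_SYNSETS = [
    {"car", "automobile"},
    {"buy", "purchase"},
]

# A small, assumed list of function words that never receive an accent here.
FUNCTION_WORDS = {"the", "a", "an", "of", "to", "and", "is", "was", "it", "she", "he"}


def is_given(word, seen):
    """Return True if the word, or a toy synonym of it, was already mentioned."""
    if word in seen:
        return True
    return any(word in synset and bool(synset & seen) for synset in TOY_SYNSETS)


def mark_accents(sentences):
    """Label each word with "H*" if it is new to the discourse, else None."""
    seen = set()      # content words mentioned so far
    marked = []
    for sentence in sentences:
        labels = []
        for token in sentence.lower().split():
            word = token.strip(".,!?")
            if not word:
                continue
            if word in FUNCTION_WORDS or is_given(word, seen):
                labels.append((word, None))   # function word or given: no accent
            else:
                labels.append((word, "H*"))   # new information: accent
            if word not in FUNCTION_WORDS:
                seen.add(word)
        marked.append(labels)
    return marked


if __name__ == "__main__":
    for labels in mark_accents(["She bought a car.", "The automobile was red."]):
        print(labels)
```

In the second sentence, "automobile" is left unaccented because its toy synonym "car" was already mentioned, while "red" is accented as new information. A fuller system would also distinguish accent types and prominence levels, which this sketch does not attempt.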


Related Resources

Study on Unit-Selection and Statistical Parametric Speech Synthesis Techniques

One interesting topic in the multimedia domain is enabling computers to produce speech. Speech synthesis grants computers the human ability to produce speech. Data-based and process-based approaches are the two main approaches to speech synthesis, and each has its own challenges. Unit-selection speech synthesis and statistical parametr...


Higher Level Organization and Discourse Prosody

This paper addresses higher-level organization in discourse prosody. Fluent speech prosody in text reading illustrated higher-level speech planning above phrases and prosody segments above intonation units. Adopting a top-down perspective allowed clearer reflection of the scope and units involved. We examined a large amount of speech data via a corpus approach, studied read discourse through perceived...


FORM: An Extensible, Kinematically-based Gesture Annotation Scheme

Annotated corpora have played a critical role in speech and natural language research, and there is an increasing interest in corpora-based research in sign language and gesture as well. We present a non-semantic, geometrically-based annotation scheme, FORM, which allows an annotator to capture the kinematic information in a gesture just from videos of speakers. In addition, FORM stores this ge...


Learning Intonation Rules for Concept to Speech Generation

In this paper, we report on an effort to provide a general-purpose spoken language generation tool for Concept-to-Speech (CTS) applications by extending a widely used text generation package, FUF/SURGE, with an intonation generation component. As a first step, we applied machine learning and statistical models to learn intonation rules based on the semantic and syntactic information typically r...


Representing Discourse Information for Spoken Dialogue Generation

Prosody and intonation convey important distinctions of “Information Structure”, marking portions of the utterance as standing in relations to the surrounding discourse such as “theme” and “rheme”, and marking relations of contrast between referring expressions and potential reference sets. The use of default intonation contours in standard “text-to-speech” applications can be quite successful,...




Publication date: 1997
